Topic clause identification method based on specific features
JIANG Yuru, SONG Rou
Journal of Computer Applications    2014, 34 (5): 1345-1349.   DOI: 10.11772/j.issn.1001-9081.2014.05.1345

When identifying the Topic Clause (TC) of a Punctuation Clause (PClause), generating Candidate Topic Clauses (CTCs) by brute force makes the identification system slow and lowers its accuracy. A new CTC generation method is proposed that uses specific features: the location of the PClause in the text, the grammatical features of the topic, and the adjacency features of the topic and its comment. The experimental results show that the improved method not only makes the system more efficient by reducing the number of CTCs, but also raises the accuracy of TC identification for a single PClause and for a PClause sequence by 0.96 and 1.31 percentage points, respectively, over the existing approach.

Disambiguation of domain word segmentation based on unsupervised learning
XIU Chi, SONG Rou
Journal of Computer Applications    2013, 33 (03): 780-783.   DOI: 10.3724/SP.J.1087.2013.00780
In Chinese natural language processing, domain word segmentation is much more difficult than general word segmentation, and segmentation ambiguity in particular has lacked an effective solution. To address this problem, an unsupervised learning method for resolving domain segmentation ambiguity is proposed. String frequency, mutual information, and boundary entropy are selected as evaluation criteria for segmentation ambiguity, and they are applied both individually and in combination. The experimental results suggest that the proposed method resolves domain segmentation ambiguity efficiently and effectively.
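
The three statistics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`count_occurrences`, `mutual_information`, `boundary_entropy`), the character-level probability estimates, and the toy corpus are all assumptions made for illustration, using the standard textbook definitions of pointwise mutual information and neighbour-character entropy.

```python
import math
from collections import Counter


def count_occurrences(corpus: str, s: str) -> int:
    """String frequency: number of (possibly overlapping) occurrences of s."""
    return sum(1 for i in range(len(corpus) - len(s) + 1) if corpus.startswith(s, i))


def mutual_information(corpus: str, left: str, right: str) -> float:
    """Pointwise mutual information (base 2) of two adjacent strings.

    Probabilities are estimated as count / corpus length, a simplifying
    assumption for this sketch.
    """
    n = len(corpus)
    joint = count_occurrences(corpus, left + right)
    if joint == 0:
        return float("-inf")
    p_joint = joint / n
    p_left = count_occurrences(corpus, left) / n
    p_right = count_occurrences(corpus, right) / n
    return math.log2(p_joint / (p_left * p_right))


def boundary_entropy(corpus: str, s: str, side: str = "right") -> float:
    """Entropy (base 2) of the characters adjacent to s on the given side.

    A high value means s is followed (or preceded) by many different
    characters, which is weak evidence that a word boundary lies there.
    """
    neighbours = Counter()
    for i in range(len(corpus) - len(s) + 1):
        if corpus.startswith(s, i):
            j = i + len(s) if side == "right" else i - 1
            if 0 <= j < len(corpus):
                neighbours[corpus[j]] += 1
    total = sum(neighbours.values())
    return -sum(c / total * math.log2(c / total) for c in neighbours.values())


# Toy corpus (hypothetical); a real experiment would use unlabelled domain text.
corpus = "abcabdabe"
freq_ab = count_occurrences(corpus, "ab")
mi_a_b = mutual_information(corpus, "a", "b")
right_entropy_ab = boundary_entropy(corpus, "ab", side="right")
left_entropy_ab = boundary_entropy(corpus, "ab", side="left")
```

In this toy corpus "ab" occurs three times and is followed by three different characters, so its right boundary entropy is high, which is the kind of evidence that would favour placing a boundary after "ab". How the paper weights or combines the three statistics is not stated in the abstract, so no combination rule is sketched here.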